64 research outputs found

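    Arquitectura de datos avanzada de un directorio web, con optimización de consultas restringidas a una zona del grafo de categorías [Advanced data architecture for a web directory, with optimization of queries restricted to an area of the category graph]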

    [Abstract] Since its inception, the World Wide Web has grown exponentially, producing a large volume of heterogeneous information accessible to any user. This has led to the use of efficient tools to manage, retrieve, and filter that information. In particular, Web directories are taxonomies that classify web documents, over which queries are later run. This kind of information retrieval system exhibits a specific type of search in which the document collection is restricted to one area of the category graph. This dissertation presents a data architecture specific to Web directories that improves performance for restricted searches. The architecture is based on a hybrid data structure consisting of an inverted file with multiple embedded signature files. Two variants are defined on the proposed model: the hybrid architecture with total information and the hybrid architecture with partial information. The validity of this architecture was analyzed by implementing both variants and comparing them with a basic model, showing a clear improvement in the performance of restricted queries. The hybrid model with partial information stands out in particular, responding adequately under any load of the search system. In general, the proposed architecture is characterized by its ease of implementation, which derives from the data structures used, its flexibility with respect to system growth and, especially, the good performance it offers for restricted searches.
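
The hybrid structure described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the class name `HybridIndex`, the signature width, the hashing scheme, and the choice to store one category signature per posting are all assumptions made for illustration. Each posting carries a superimposed-coding signature of the document's category and its ancestors, so a restricted query can discard most out-of-area documents with a bit test before an exact check removes false drops.

```python
import hashlib
from collections import defaultdict

SIG_BITS = 64          # signature width (assumed value)
BITS_PER_CATEGORY = 3  # hash positions set per category (assumed value)


def category_signature(category):
    """Map a category name to a bit mask with up to BITS_PER_CATEGORY bits set."""
    sig = 0
    for i in range(BITS_PER_CATEGORY):
        digest = hashlib.sha256(f"{category}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


class HybridIndex:
    """Inverted file whose postings embed a signature of the document's categories."""

    def __init__(self, parents):
        self.parents = parents              # category -> list of parent categories
        self.postings = defaultdict(list)   # term -> [(doc_id, signature)]
        self.doc_categories = {}            # doc_id -> category plus all ancestors

    def _with_ancestors(self, category):
        """Return the category together with every ancestor in the graph."""
        found, stack = set(), [category]
        while stack:
            cat = stack.pop()
            if cat not in found:
                found.add(cat)
                stack.extend(self.parents.get(cat, []))
        return found

    def add(self, doc_id, terms, category):
        cats = self._with_ancestors(category)
        sig = 0
        for cat in cats:
            sig |= category_signature(cat)  # superimpose all category signatures
        self.doc_categories[doc_id] = cats
        for term in set(terms):
            self.postings[term].append((doc_id, sig))

    def search(self, term, restrict_to):
        """Return ids of documents containing term inside the restricted area."""
        query_sig = category_signature(restrict_to)
        hits = []
        for doc_id, sig in self.postings.get(term, []):
            if (sig & query_sig) != query_sig:
                continue  # the signature proves the document is outside the area
            if restrict_to in self.doc_categories[doc_id]:  # discard false drops
                hits.append(doc_id)
        return hits
```

In this sketch the bit test is only a filter: a signature match may be a false drop, which is why the exact category check follows it.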

    Queuing theory-based latency/power tradeoff models for replicated search engines

    Large-scale search engines are built upon huge infrastructures involving thousands of computers in order to achieve fast response times. However, the energy consumed (and hence the financial cost) is also high, leading to environmental damage. This paper proposes new approaches to increase energy and financial savings in large-scale search engines while maintaining good query response times. We aim to improve current state-of-the-art models used for balancing power and latency by integrating new advanced features. On the one hand, we propose to improve the power savings by completely powering down the query servers that are not necessary when the load of the system is low. In addition, we incorporate energy rates into the model formulation. On the other hand, we focus on how to accurately estimate the latency of the whole system by means of Queueing Theory. Experiments using actual query logs attest to the high energy (and financial) savings relative to current baselines. To the best of our knowledge, this is the first paper to successfully apply stationary Queueing Theory models to estimate the latency in a large-scale search engine.
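
As a rough illustration of the latency/power tradeoff, the sketch below estimates mean response time with a stationary M/M/c queue (Erlang C) and picks the smallest number of powered-on query servers that meets a latency target. The abstract does not say which queueing model the paper uses, so M/M/c, the function names, and the parameters are assumptions made only for this example.

```python
import math


def erlang_c(servers, offered_load):
    """Probability that an arriving query has to wait in an M/M/c queue."""
    if offered_load >= servers:
        return 1.0  # unstable system: every query waits
    below = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    last = (offered_load ** servers / math.factorial(servers)
            * servers / (servers - offered_load))
    return last / (below + last)


def mean_response_time(servers, arrival_rate, service_rate):
    """Mean time in system (waiting plus service) for an M/M/c queue."""
    offered_load = arrival_rate / service_rate
    if offered_load >= servers:
        return math.inf
    waiting = erlang_c(servers, offered_load) / (servers * service_rate - arrival_rate)
    return waiting + 1.0 / service_rate


def min_servers(arrival_rate, service_rate, target_latency, max_servers):
    """Smallest number of active servers meeting the latency target, or None."""
    for servers in range(1, max_servers + 1):
        if mean_response_time(servers, arrival_rate, service_rate) <= target_latency:
            return servers
    return None
```

Under these assumptions, servers beyond the value returned by `min_servers` are candidates for being powered down while the load stays at `arrival_rate`.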

    Early Detection of Depression: Social Network Analysis and Random Forest Techniques

    [Abstract] Background: Major depressive disorder (MDD), or depression, is among the most prevalent psychiatric disorders, affecting more than 300 million people globally. Early detection is critical for rapid intervention, which can potentially reduce the escalation of the disorder. Objective: This study used data from social media networks to explore various methods of early detection of MDD based on machine learning. We performed a thorough analysis of the dataset to characterize the subjects’ behavior based on different aspects of their writings: textual spreading, time gap, and time span. Methods: We proposed 2 different approaches based on machine learning: singleton and dual. The former uses 1 random forest (RF) classifier with 2 threshold functions, whereas the latter uses 2 independent RF classifiers, one to detect depressed subjects and another to identify nondepressed individuals. In both cases, features are defined from textual, semantic, and writing similarities. Results: The evaluation follows a time-aware approach that rewards early detections and penalizes late detections. The results show that the dual model performs significantly better than the singleton model and improves on current state-of-the-art detection models by more than 10%. Conclusions: Given the results, we consider that this study can help in the development of new solutions to deal with the early detection of depression on social networks. Ministerio de Economía y Competitividad; TIN2015-70648-P. Xunta de Galicia; ED431G/01 2016-201
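
The dual approach can be sketched as a sequential decision rule. This is only an illustration: the two scoring callables stand in for the paper's two RF classifiers, and the threshold values, function name, and the "undecided" outcome are assumptions not taken from the abstract.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps the writings seen so far to a confidence in [0, 1].
Scorer = Callable[[Sequence[str]], float]


def dual_early_decision(
    writings: Sequence[str],
    positive_score: Scorer,      # stand-in for the "depressed" classifier
    negative_score: Scorer,      # stand-in for the "nondepressed" classifier
    pos_threshold: float = 0.9,  # assumed value
    neg_threshold: float = 0.9,  # assumed value
) -> Tuple[str, int]:
    """Read writings one at a time and stop as soon as either model is confident.

    Returns the decision and the number of writings consumed to reach it.
    """
    seen = []
    for text in writings:
        seen.append(text)
        if positive_score(seen) >= pos_threshold:
            return "depressed", len(seen)
        if negative_score(seen) >= neg_threshold:
            return "nondepressed", len(seen)
    return "undecided", len(seen)
```

Returning the number of writings consumed alongside the decision is what lets a time-aware evaluation reward early decisions and penalize late ones.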

    Un concurso de cortos para el refuerzo pedagógico y la mejora de la participación del alumnado [A short film contest for pedagogical reinforcement and improved student participation]

    [Abstract] The Networks course of the Degree in Computer Engineering at the Universidade da Coruña covers the fundamentals of communication over a computer network. To encourage student participation and increase motivation, a short film contest was proposed. The intent is for the student to be an active protagonist of the learning process, in line with the purpose of the current educational reform. In the activity, each student creates a video of at most 3 minutes that explains a concept. The videos are then evaluated against a rubric by several students and teachers, in a blind evaluation in which those being evaluated do not know their evaluators. The benefit of the activity is twofold: students who prepare the videos must study and understand the material, and students who watch the videos learn in a more informal and fun way. To gather further feedback, surveys were given to the students, and the results were very positive. Moreover, good-quality videos were obtained, which can be used as teaching material.

    High Order Profile Expansion to tackle the new user problem on recommender systems

    Data Availability: The complete dataset for the High Order Profile Expansion experiments has been published in the public repository: https://doi.org/10.6084/m9.figshare.9798155. [Abstract] Collaborative Filtering algorithms provide users with recommendations based on their opinions, that is, on the ratings given by the user for some items. They are the most popular and widely implemented algorithms in Recommender Systems, especially in e-commerce, because of their good results. However, when the information is extremely sparse, regardless of the domain, they do not perform as well. In particular, it is difficult to offer sufficiently accurate recommendations to a user who has just arrived at a system or who has rated few items. This is the well-known new user problem, a type of cold-start. Profile Expansion techniques have already been presented as a method to alleviate this situation. These techniques increase the size of the user profile by obtaining information about user tastes in distinct ways. Recommender algorithms therefore have more information at their disposal, and results improve. In this paper, we present the High Order Profile Expansion techniques, which combine the Profile Expansion methods in different ways. The results show 110% improvement in precision over the algorithm without Profile Expansion, and 10% improvement over Profile Expansion techniques. Ministerio de Economía y Competitividad; TIN2015-70648-P. Xunta de Galicia; ED431G/01 2016-201
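
To make the idea of profile expansion concrete, here is a minimal sketch under stated assumptions. The abstract does not specify how the expansion methods work or how they are combined, so the Jaccard item similarity, the mean-rating imputation, and the reading of "high order" as chaining one expansion into the next are all hypothetical choices for illustration, as are the function names.

```python
from collections import defaultdict


def build_item_similarity(ratings):
    """Build a Jaccard item similarity from a user -> {item: rating} mapping."""
    users_by_item = defaultdict(set)
    for user, profile in ratings.items():
        for item in profile:
            users_by_item[item].add(user)

    def sim(a, b):
        union = users_by_item[a] | users_by_item[b]
        return len(users_by_item[a] & users_by_item[b]) / len(union) if union else 0.0

    return sim, set(users_by_item)


def expand_profile(profile, sim, all_items, k):
    """Return a copy of profile plus up to k unseen items most similar to it.

    Added items receive the user's mean rating (an assumed imputation rule).
    """
    if not profile:
        return dict(profile)
    mean_rating = sum(profile.values()) / len(profile)
    scores = {c: sum(sim(c, i) for i in profile)
              for c in all_items if c not in profile}
    # Highest score first; ties broken by item name for determinism.
    best = sorted(scores, key=lambda c: (-scores[c], c))[:k]
    expanded = dict(profile)
    for candidate in best:
        if scores[candidate] > 0:
            expanded[candidate] = mean_rating
    return expanded


def high_order_expansion(profile, sim, all_items, k, order):
    """Chain the expansion `order` times, feeding each result into the next."""
    for _ in range(order):
        profile = expand_profile(profile, sim, all_items, k)
    return profile
```

In this sketch a second-order expansion can reach items that share no users with the original profile but do share users with the items added in the first pass.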

    Annotated Dataset for Anomaly Detection in a Data Center with IoT Sensors

    [Abstract] The relative simplicity of IoT networks increases their service vulnerabilities and their exposure to network failures that reveal system weaknesses. Therefore, having a dataset with a sufficient number of samples, labeled and systematically analyzed, is essential in order to understand how these networks behave and to detect traffic anomalies. This work presents DAD: a complete and labeled IoT dataset containing a reproduction of certain real-world behaviors as seen from the network. To bring the dataset close to a real environment, the data were obtained from a physical data center, with temperature sensors based on NFC smart passive sensor technology. After trying different approaches, mathematical modeling using time series was finally chosen. The virtual infrastructure necessary for the creation of the dataset consists of five virtual machines: an MQTT broker and four client nodes, each of them with four sensors of the refrigeration units connected to the internal IoT network. DAD presents seven days of network activity with three types of anomalies: duplication, interception, and modification of the MQTT message, spread over 5 days. Finally, a feature description is provided, so that the dataset can be used with various prediction or automatic classification techniques. This project was funded by the Accreditation, Structuring, and Improvement of Consolidated Research Units and Singular Centers (ED431G/01), funded by Vocational Training of the Xunta de Galicia endowed with EU FEDER funds. This research was partially supported by the Ministry of Science and Innovation, Spain’s National Research and Development Plan, through the PID2019-111388GB-I00 project. Xunta de Galicia; ED431G/0
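
The three anomaly types can be pictured with a small rule-based labeler. It is a hypothetical sketch, not the procedure used to build DAD: it assumes that duplication shows up as an identical reading arriving well before the next one is due, interception as a missing message (a long gap), and modification as a value outside a plausible temperature range. The `Reading` type, the rules, and all parameters are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    timestamp: float  # seconds since the start of the capture
    value: float      # temperature in degrees Celsius


def label_readings(readings, period, low, high):
    """Label each reading as normal, duplication, interception or modification.

    period is the expected publishing interval per sensor; low and high bound
    the plausible values. All three are assumed inputs.
    """
    labels = []
    last = {}  # sensor -> previous reading from that sensor
    for r in readings:
        prev = last.get(r.sensor)
        gap = None if prev is None else r.timestamp - prev.timestamp
        if not (low <= r.value <= high):
            label = "modification"   # implausible payload
        elif gap is not None and gap < 0.5 * period and r.value == prev.value:
            label = "duplication"    # same payload arrived far too early
        elif gap is not None and gap > 1.5 * period:
            label = "interception"   # at least one expected message never arrived
        else:
            label = "normal"
        labels.append(label)
        last[r.sensor] = r
    return labels
```

The rules are checked in a fixed order, so a reading that is both out of range and late is labeled as a modification.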

    Time-Aware Detection Systems

    [Abstract] Communication network data has been growing in the last decades, and with the generalisation of the Internet of Things (IoT) its growth has accelerated. The number of attacks on this kind of infrastructure has also increased due to the relevance it is gaining. As a result, it is vital to guarantee an adequate level of security and to detect threats as soon as possible. Classical methods emphasise detection but do not take into account the number of records needed to successfully identify an attack. To address this, time-aware techniques may be used for both detection and measurement. In this work, well-known machine learning methods will be explored to detect attacks based on public datasets. To assess performance, classic metrics will be used, but the number of elements processed will also be taken into account in order to determine a time-aware performance of the method. Ministerio de Economía y Competitividad; TIN2015-70648-P. Xunta de Galicia; ED431G/01 2016-201
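
One way to report classic metrics together with the number of elements processed is sketched below. The abstract does not define a specific time-aware metric, so the report format, the function names, and the `is_attack_so_far` callable (a stand-in for any trained detector) are assumptions for illustration.

```python
def detect_stream(records, is_attack_so_far):
    """Feed records one by one and stop at the first alarm.

    Returns (flagged, records_used).
    """
    seen = []
    for record in records:
        seen.append(record)
        if is_attack_so_far(seen):
            return True, len(seen)
    return False, len(seen)


def time_aware_report(streams, labels, is_attack_so_far):
    """Report precision and recall plus the mean records needed per true detection."""
    tp = fp = fn = 0
    delays = []  # records consumed for each correctly detected attack
    for records, is_attack in zip(streams, labels):
        flagged, used = detect_stream(records, is_attack_so_far)
        if flagged and is_attack:
            tp += 1
            delays.append(used)
        elif flagged:
            fp += 1
        elif is_attack:
            fn += 1
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "mean_records_to_detection": sum(delays) / len(delays) if delays else None,
    }
```

Two detectors with the same precision and recall can then be separated by `mean_records_to_detection`, where the lower value means attacks are caught earlier.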

    Early Detection of Cyberbullying on Social Media Networks

    [Abstract] Cyberbullying is an important issue for our society and has a major negative effect on the victims, which can be highly damaging due to the frequency and high propagation provided by Information Technologies. Therefore, the early detection of cyberbullying in social networks becomes crucial to mitigate the impact on the victims. In this article, we aim to explore different approaches that take time into account in the detection of cyberbullying in social networks. We follow a supervised learning method with two different specific early detection models, named threshold and dual. The former follows a simpler approach, while the latter requires two machine learning models. To the best of our knowledge, this is the first attempt to investigate the early detection of cyberbullying. We propose two groups of features and two early detection methods, specifically designed for this problem. We conduct an extensive evaluation using a real-world dataset, following a time-aware evaluation that penalizes late detections. Our results show that we can improve on baseline detection models by up to 42%. This research was supported by the Ministry of Economy and Competitiveness of Spain and FEDER funds of the European Union (Project PID2019-111388GB-I00) and by the Centro de Investigación de Galicia “CITIC”, funded by Xunta de Galicia (Galicia, Spain) and the European Union (European Regional Development Fund, Galicia 2014–2020 Program), by grant ED431G 2019/01. Xunta de Galicia; ED431G 2019/0

    Vídeos cortos realizados por los alumnos como recurso docente. Diferentes enfoques [Short videos made by students as a teaching resource. Different approaches]

    [Abstract] This work presents the development of a teaching innovation project, and the analysis of its results, based on students creating short videos, 3 or 4 minutes long, in which they must explain a concept or topic proposed by the teacher. In this activity students work on digital skills, information search and synthesis, communication, group work and, through peer assessment, critical capacity. The experience was carried out jointly across courses and teachers from several universities, degree programs, and years. Starting from common guidelines, each course adapted the experience to its own specific context, which allows for different approaches. This project was partially funded by the teaching innovation call of the Universidad de Zaragoza for the 2015/16 and 2016/17 academic years (Projects PIIDUZ_15_411 and PIIDUZ_16_070), the Ministerio de Economía y Competitividad of Spain (Projects TIN2015-70648-P and TIN2015-64770-R), the European Social Fund, and the Gobierno de Aragón (recognized research group T98). Azuara Guillén, G.; Fernández Iglesias, D.; López Torres, AM.; Salinas Baldellou, AM.; Aguilar Martín, MC.; Salazar Riaño, JL.; Fernández-Navajas, J.... (2018). Vídeos cortos realizados por los alumnos como recurso docente. Diferentes enfoques. In XIII Jornadas de Ingeniería telemática (JITEL 2017). Libro de actas. Editorial Universitat Politècnica de València. 348-355. https://doi.org/10.4995/JITEL2017.2017.6566